Methods of producing ethanol using hydrolytic enzyme mixtures for saccharification of lignocellulosic polysaccharides

ABSTRACT

The present invention relates to cell wall degradative systems, in particular to systems containing enzymes that bind to and/or depolymerize cellulose. These systems have a number of applications. Some embodiments relate to a method of producing ethanol using the cell wall degradative systems of the present invention.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. application Ser. No. 11/519,104, filed Sep. 12, 2006, which is a continuation-in-part of U.S. application Ser. No. 11/121,154, filed May 4, 2005, issued as U.S. Pat. No. 7,365,180 on Apr. 29, 2008, and claims priority to U.S. Provisional Patent Application No. 60/567,971, filed May 4, 2004, the contents of which are incorporated herein, in their entirety, by reference.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

This invention was made with government support under Contract Number SA7528051E awarded by the National Oceanic and Atmospheric Administration (NOAA) and Contract Number DEB0109869 awarded by the National Science Foundation (NSF). The government has certain rights in the invention.

SEQUENCE LISTING

The present application contains a lengthy Sequence Listing, which has been submitted via triplicate CD-R in lieu of a printed paper copy, and is hereby incorporated by reference in its entirety. The CD-Rs, recorded on Sep. 14, 2005 in related U.S. application Ser. No. 11/121,154 filed May 4, 2005, are labeled “CRF”, “Copy 1,” and “Copy 2,” respectively, and each contains only one identical 828 KB file (18172121.APP).

BACKGROUND OF THE INVENTION

1. Field of the Invention

The invention is generally directed to degradative enzymes and systems. In particular, the present invention is directed to plant cell wall degrading enzymes and associated proteins found in Microbulbifer degradans, systems containing such enzymes and/or proteins, and methods of using the systems to obtain ethanol.

2. Background of the Invention:

Cellulases and related enzymes have been utilized in food, beer, wine, animal feeds, textile production and laundering, the pulp and paper industry, and agricultural industries. Various such uses are described in the paper “Cellulases and related enzymes in biotechnology” by M. K. Bhat (Biotechnology Advances 18 (2000) 355-383), the subject matter of which is hereby incorporated by reference in its entirety.

The cell walls of plants are composed of a heterogeneous mixture of complex polysaccharides that interact through covalent and noncovalent means. Complex polysaccharides of higher plant cell walls include, for example, cellulose (β-1,4 glucan), which generally makes up 35-50% of carbon found in cell wall components. Cellulose polymers self-associate through hydrogen bonding, van der Waals interactions and hydrophobic interactions to form semi-crystalline cellulose microfibrils. These microfibrils also include noncrystalline regions, generally known as amorphous cellulose. The cellulose microfibrils are embedded in a matrix formed of hemicelluloses (including, e.g., xylans, arabinans, and mannans), pectins (e.g., galacturonans and galactans), and various other β-1,3 and β-1,4 glucans. These matrix polymers are often substituted with, for example, arabinose, galactose and/or xylose residues to yield highly complex arabinoxylans, arabinogalactans, galactomannans, and xyloglucans. The hemicellulose matrix is, in turn, surrounded by polyphenolic lignin.

The complexity of the matrix makes it difficult for microorganisms to degrade, as lignin and hemicellulose components must be degraded before enzymes can act on the core cellulose microfibrils. Ordinarily, a consortium of different microorganisms is required to degrade cell wall polymers to release the constituent monosaccharides. For saccharification of plant cell walls, the lignin must be permeabilized and hemicellulose removed to allow cellulose-degrading enzymes to act on their substrate. For industrial saccharification of cell walls, large amounts of primarily fungal cellulases are added to processed feedstock that has been treated with dilute sulfuric acid at high temperature and pressure to permeabilize the lignin and partially saccharify the hemicellulose constituents.

Saccharophagus degradans strain 2-40 (herein referred to as “S. degradans 2-40” or “2-40”) is a representative of an emerging group of marine bacteria that degrade complex polysaccharides (CP). S. degradans has been deposited at the American Type Culture Collection and bears accession number ATCC 43961. S. degradans 2-40, formerly known and referred to synonymously herein as Microbulbifer degradans strain 2-40 (“M. degradans 2-40”), is a marine γ-proteobacterium that was isolated from decaying Spartina alterniflora, a salt marsh cord grass in the Chesapeake Bay watershed. Consistent with its isolation from decaying plant matter, S. degradans strain 2-40 is able to degrade many complex polysaccharides, including cellulose, pectin, xylan, and chitin, which are common components of the cell walls of higher plants. S. degradans strain 2-40 is also able to depolymerize algal cell wall components, such as agar, agarose, and laminarin, as well as protein, starch, pullulan, and alginic acid. In addition to degrading this plethora of polymers, S. degradans strain 2-40 can utilize each of the polysaccharides as the sole carbon source. Therefore, S. degradans strain 2-40 is not only an excellent model of microbial degradation of insoluble complex polysaccharides (ICPs) but can also be used as a paradigm for complete metabolism of these ICPs. ICPs are polymerized saccharides that are used for form and structure in animals and plants. They are insoluble in water and therefore are difficult to break down.

Microbulbifer degradans strain 2-40 requires at least 1% sea salts for growth and will tolerate salt concentrations as high as 10%. It is a highly pleomorphic, Gram-negative bacterium that is aerobic, generally rod-shaped, and motile by means of a single polar flagellum. Previous work has determined that 2-40 can degrade at least 10 different carbohydrate polymers (CP), including agar, chitin, alginic acid, carboxymethylcellulose (CMC), β-glucan, laminarin, pectin, pullulan, starch and xylan (Ensor, Stotz et al. 1999). In addition, it has been shown to synthesize a true tyrosinase (Kelley, Coyne et al. 1990). 16S rDNA analysis shows that 2-40 is a member of the gamma-subclass of the phylum Proteobacteria, related to Microbulbifer hydrolyticus (Gonzalez and Weiner 2000) and to Teredinibacter sp. (Distel, Morrill et al. 2002), cellulolytic nitrogen-fixing bacteria that are symbionts of shipworms.

The agarase, chitinase and alginase systems have been generally characterized. Zymogram activity gels indicate that all three systems are composed of multiple depolymerases, and multiple lines of evidence suggest that at least some of these depolymerases are attached to the cell surface (Stotz 1994; Whitehead 1997; Chakravorty 1998). Activity assays reveal that the majority of 2-40 enzyme activity resides with the cell fraction during logarithmic growth on CP, while in later growth phases the bulk of the activity is found in the supernatant and cell-bound activity decreases dramatically (Stotz 1994). Growth on CP is also accompanied by dramatic alterations in cell morphology. Glucose-grown cultures of 2-40 are relatively uniform in cell size and shape, with generally smooth and featureless cell surfaces. However, when grown on agarose, alginate, or chitin, 2-40 cells exhibit novel surface structures and features.

These exo- and extra-cellular structures (ES) include small protuberances, larger bleb-like structures that appear to be released from the cell, fine fimbriae or pili, and a network of fibril-like appendages which may be tubules of some kind. Immunoelectron microscopy has shown that agarases, alginases and/or chitinases are localized in at least some types of 2-40 ES. The surface topology and pattern of immunolocalization of 2-40 enzymes to surface protuberances are very similar to what is seen with cellulolytic members of the genus Clostridium.

The oldest methods studied to convert lignocellulosic materials to saccharides are based on acid hydrolysis (see, e.g., review by Grethlein, Chemical Breakdown Of Cellulosic Materials, J. APPL. CHEM. BIOTECHNOL. 28:296-308 (1978)). This process can involve the use of concentrated or dilute acids. For example, U.S. Pat. Nos. 5,221,537 and 5,536,325, incorporated by reference herein in their entireties, describe a two-step process for the acid hydrolysis of lignocellulosic material to glucose. These processes have numerous disadvantages including, for example, recovery of the acid, the specialized materials of construction required, the need to minimize water in the system, and the high production of degradation products which can inhibit the fermentation to ethanol.

To overcome the problems of the acid hydrolysis process, cellulose conversion processes are being developed using enzymatic hydrolysis. See, for example, U.S. Pat. No. 5,916,780, incorporated by reference herein in its entirety, which discloses enzymatic hydrolysis with a pre-treatment step to break down the integrity of the fiber structure and make the cellulose more accessible to attack by cellulase enzymes in the treatment phase.

U.S. Pat. No. 6,333,181, incorporated by reference herein in its entirety, discloses production of ethanol from lignocellulosic material by treatment of a mixture of lignocellulose, cellulose, and an ethanologenic microorganism with ultrasound.

There exists a need to identify enzyme systems that use cellulose as a substrate, express the genes encoding the proteins using suitable vectors, identify and isolate the amino acid products (enzymes and non-enzymatic products), and use these products, as well as organisms containing these genes, for purposes such as the production of ethanol and the uses described in the Bhat paper. There is also a need, in the art of using lignocellulosic materials for production of ethanol, to develop more effective treatment methods that result in greater yields of ethanol.

SUMMARY OF THE INVENTION

One aspect of the present invention is directed to systems of plant wall active carbohydrases and related proteins.

A further aspect of the invention is directed to a method for the degradation of substances comprising cellulose. The method involves contacting the cellulose-containing substances with one or more compounds obtained from Saccharophagus degradans strain 2-40.

Another aspect of the present invention is directed to groups of enzymes that catalyze reactions involving cellulose.

Another aspect of the present invention is directed to polynucleotides that encode polypeptides with cellulose degrading or cellulose binding activity.

A further aspect of the invention is directed to chimeric genes and vectors comprising genes that encode polypeptides with cellulose depolymerase activity.

A further aspect of the invention is directed to a method for the identification of a nucleotide sequence encoding a polypeptide having either of the following activities from S. degradans: cellulose depolymerase or cellulose binding. An S. degradans genomic library can be constructed in E. coli and screened for the desired activity. Transformed E. coli cells with specific activity are created and isolated.

Another aspect of the invention is directed to a method for producing ethanol from lignocellulosic material, comprising treating lignocellulosic material with an effective saccharifying amount of one or more compounds listed in FIGS. 4-11 to obtain saccharides and converting the saccharides to produce ethanol. Conversion of sugars to ethanol and recovery may be accomplished by, but are not limited to, any of the well-established methods known to those of skill in the art, for example, through the use of an ethanologenic microorganism, such as Zymomonas, Erwinia, Klebsiella, Xanthomonas, and Escherichia, preferably Escherichia coli KO11 and Klebsiella oxytoca P2.

A further aspect of the invention is directed to a method for producing ethanol from lignocellulosic material, comprising contacting lignocellulosic material with a microorganism expressing an effective saccharifying amount of one or more compounds listed in FIGS. 4-11 to obtain saccharides and converting the saccharides to produce ethanol.

A further aspect of the invention is directed to a method for producing ethanol from lignocellulosic material, comprising contacting lignocellulosic material with an ethanologenic microorganism expressing an effective saccharifying amount of one or more compounds listed in FIGS. 4-11 to produce ethanol. Such an ethanologenic microorganism expresses an effective amount of one or more compounds listed in FIGS. 4-11 to saccharify the lignocellulosic material and an effective amount of one or more enzymes or enzyme systems which, in turn, catalyze (individually or in concert) the conversion of the saccharides to ethanol.

Further aspects of the invention are directed to utilization of the cellulose degrading substances in food, beer, wine, animal feeds, textile production and laundering, the pulp and paper industry, and agricultural industries.

The present invention is advantageous in that saccharification of plant cell walls, and ethanol production processes including saccharification, may be obtained without permeabilizing lignin and/or removing or partially saccharifying the hemicellulose or hemicellulose constituents before the cellulose-degrading enzymes can act on their substrate. The present invention also allows for saccharification, and ethanol production processes including saccharification, without or with a reduced amount of fungal cellulases, acids (e.g., sulfuric acid), high temperatures, and high pressures in the saccharification process.

Other aspects, features, and advantages of the invention will become apparent from the following detailed description when taken in conjunction with the accompanying figures, which are part of this disclosure and which illustrate by way of example the principles of this invention.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1A shows the chemical formula of cellulose;

FIG. 1B illustrates the physical structure of cellulose;

FIG. 2A illustrates the degradation of cellulose fibrils;

FIG. 2B shows the chemical representation of cellulose degradation to cellobiose and glucose;

FIG. 3 shows SDS-PAGE and Zymogram analysis of 2-40 culture supernatants;

FIG. 4 lists the predicted cellulases of S. degradans 2-40 (the sequences from FIGS. 4-10 are disclosed as SEQ ID NOs 1-214, respectively, in order of appearance in the appendix);

FIG. 5 lists the predicted xylanases, xylosidases and related accessories of M. degradans 2-40;

FIG. 6 lists the predicted pectinases and related accessories of S. degradans 2-40;

FIG. 7 lists the arabinanases and arabinogalactanases of S. degradans 2-40;

FIG. 8 lists the mannanases of S. degradans 2-40;

FIG. 9 lists the laminarinases of S. degradans 2-40;

FIG. 10 lists selected carbohydrate-binding module proteins of S. degradans 2-40; and

FIG. 11 lists the recombinant proteins of S. degradans 2-40 and a comparison of predicted vs. observed molecular weights thereof.

DETAILED DESCRIPTION

Analysis of the genome sequence of S. degradans 2-40 reveals an abundance of genes coding for enzymes that are predicted to degrade plant-derived carbohydrates. To date, 2-40 is the only sequenced marine bacterium with apparently complete cellulase and xylanase systems, as well as a number of other systems containing plant-wall active carbohydrases.

Thus it appears that 2-40 can play a significant role in the marine carbon cycle, functioning as a “super-degrader” that mediates the breakdown of CP from various algal, plant, and invertebrate sources. The remarkable enzymatic diversity, novel surface features (ES), and the apparent localization of carbohydrases to ES make S. degradans 2-40 an intriguing organism in which to study the cell biology of CP metabolism and surface enzyme attachment.

It has now been discovered that 2-40 has a complete complement of enzymes, suitably positioned, to degrade plant cell walls. This has been accomplished by the following approaches: a) annotation and genomic analysis of 2-40 plant-wall active enzyme systems, b) identification of enzymes and other proteins which contain domains or motifs that may be involved in surface enzyme display, c) the development of testable models based on identified protein motifs, and d) cloning and expression of selected proteins for the production of antibody probes to allow testing of proposed models of surface enzyme display using immunoelectron microscopy.

These efforts have been greatly facilitated by the recent sequencing of the genome of 2-40, allowing a strategy where genes which code for proteins with potential involvement in surface attachment may be identified based on sequence homology with modules or domains known to function in surface attachment and/or adhesion.

Enzymatic and non-enzymatic ORFs with compelling sequence elements are identified using BLAST and other amino acid sequence alignment and analysis tools. Genes of interest can be cloned into E. coli, expressed with in-frame polyhistidine affinity tag fusions and purified by nickel ion chromatography, thus providing the means of identifying and producing recombinant 2-40 proteins for study and antibody probe production.

The genome sequence of 2-40 was recently obtained in conjunction with the Department of Energy's Joint Genome Institute (JGI). The finished draft sequence dated Jan. 19, 2005 comprises 5.1 Mbp contained in a single contiguous sequence. Automated annotation of open reading frames (ORFs) was performed by the computational genomics division of the Oak Ridge National Laboratory (ORNL), and the annotated sequence is available on the World Wide Web.

The initial genome annotation has revealed a variety of carbohydrases, including a number of agarases, alginases and chitinases. Remarkably, the genome also contains an abundance of enzymes with predicted roles in the degradation of plant cell wall polymers, including a number of ORFs with homology to cellulases, xylanases, pectinases, and other glucanases and glucosidases. In all, over 180 open reading frames with a probable role in carbohydrate catabolism were identified in the draft genome.

To begin to define the cellulase, xylanase and pectinase systems of 2-40, genes were initially classified as belonging to one of those systems by BLAST homology. Ambiguous ORFs were tentatively assigned to the class of the best known hit. Other tools used to refine this tentative classification include Pfam (Protein families database of alignments and HMMs) and SMART (Simple Modular Architecture Research Tool), which use multiple alignments and hidden Markov models (statistical models of sequence consensus homology) to identify discrete modular domains within a protein sequence. These analyses were relatively successful; however, a number of ORFs remained difficult to classify based on sequence homology alone.

Enzymes have traditionally been classified by substrate specificity and reaction products. In the pre-genomic era, function was regarded as the most amenable (and perhaps most useful) basis for comparing enzymes, and assays for various enzymatic activities have been well-developed for many years, resulting in the familiar EC classification scheme. Cellulases and other O-glycosyl hydrolases, which act upon glycosidic bonds between two carbohydrate moieties (or a carbohydrate and a non-carbohydrate moiety, as occurs in nitrophenol-glycoside derivatives), are designated as EC 3.2.1.-, with the final number indicating the exact type of bond cleaved. According to this scheme an endo-acting cellulase (1,4-β-endoglucanase) is designated EC 3.2.1.4.
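The four-field layout of an EC designation described above can be illustrated with a short sketch. This is only an illustration: the names ECNumber, parse_ec and is_o_glycosyl_hydrolase are hypothetical and are not part of any enzyme database or of the disclosed methods.

```python
from typing import NamedTuple, Optional


class ECNumber(NamedTuple):
    """Hypothetical container for the four fields of an EC designation."""
    enzyme_class: int
    subclass: int
    sub_subclass: int
    serial: Optional[int]  # None when the last field is written as "-"


def parse_ec(text: str) -> ECNumber:
    """Parse strings such as 'EC 3.2.1.4' or 'EC 3.2.1.-'."""
    body = text.strip()
    if body.upper().startswith("EC"):
        body = body[2:].strip()
    parts = body.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated fields, got {text!r}")
    first_three = [int(p) for p in parts[:3]]
    serial = None if parts[3] == "-" else int(parts[3])
    return ECNumber(first_three[0], first_three[1], first_three[2], serial)


def is_o_glycosyl_hydrolase(ec: ECNumber) -> bool:
    """True for any member of EC 3.2.1.-, whatever the final field is."""
    return ec[:3] == (3, 2, 1)


endoglucanase = parse_ec("EC 3.2.1.4")
print(endoglucanase, is_o_glycosyl_hydrolase(endoglucanase))
```

Because only the first three fields are compared, the sketch groups enzymes by reaction type and says nothing about sequence relatedness.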

With the advent of widespread genome sequencing projects and the ease of determining the nucleotide sequence of cloned genes, ever-increasing amounts of sequence data have facilitated analyses and comparison of related genes and proteins on an unprecedented scale. This is particularly true for carbohydrases; it has become clear that classification of such enzymes according to reaction specificity, as is seen in the E.C. nomenclature scheme, is limited by the inability to convey sequence similarity. Additionally, a growing number of carbohydrases have been crystallized and their 3-D structures solved.

One of the major revelations of carbohydrase sequence and structure analyses is that there are discrete families of enzymes with related sequence, which contain conserved three-dimensional folds that can be predicted based on their amino acid sequence. Further, it has been shown that enzymes with the same three-dimensional fold exhibit the same stereospecificity of hydrolysis, even when they catalyze different reactions (Henrissat, Teeri et al. 1998; Coutinho and Henrissat 1999).

These findings form the basis of a sequence-based classification of carbohydrase modules which is available in the form of an internet database, the Carbohydrate-Active enZYme server (CAZy) (Coutinho and Henrissat 1999; Coutinho and Henrissat 1999).

CAZy defines four major classes of carbohydrases, based on the type of reaction catalyzed: Glycosyl Hydrolases (GH's), Glycosyltransferases (GT's), Polysaccharide Lyases (PL's), and Carbohydrate Esterases (CE's). GH's cleave glycosidic bonds through hydrolysis. This class includes many familiar polysaccharidases such as cellulases, xylanases, and agarases. GT's generally function in polysaccharide synthesis, catalyzing the formation of new glycosidic bonds through the transfer of a sugar molecule from an activated carrier molecule, such as uridine diphosphate (UDP), to an acceptor molecule. While GT's often function in biosynthesis, there are examples where the mechanism is exploited for bond cleavage, as occurs in the phosphorolytic cleavage of cellobiose and cellodextrins (Lou, Dawson et al. 1996). PL's use a β-elimination mechanism to mediate bond cleavage and are commonly involved in alginate and pectin depolymerization. CE's generally act as deacetylases on O- or N-substituted polysaccharides. Common examples include xylan and chitin deacetylases. Sequence-based families are designated by number within each class, as is seen with GH5: glycosyl hydrolase family 5. Members of GH5 hydrolyze β-1,4 bonds in a retaining fashion, using a double-displacement mechanism which results in retention of the original bond stereospecificity. Retention or inversion of anomeric configuration is a general characteristic of a given GH family (Henrissat and Bairoch 1993; Coutinho and Henrissat 1999). Many examples of endocellulases, xylanases and mannanases belonging to GH5 have been reported, illustrating the variety of substrate specificity possible within a GH family. Also, GH5s are predominantly endohydrolases, cleaving chains of their respective substrates at random locations internal to the polymer chains. While true for GH5, this generalization does not hold for many other GH families. In addition to carbohydrases, the CAZy server defines numerous families of Carbohydrate Binding Modules (CBM). As with catalytic modules, CBM families are designated based on amino acid sequence similarity and conserved three-dimensional folds.

The CAZyme structural families have been incorporated into a new classification and nomenclature scheme, developed by Bernard Henrissat and colleagues (Henrissat, Teeri et al. 1998). Traditional gene/protein nomenclature assigns an acronym indicating general function and order of discovery; in this scheme an organism's cellulase genes are designated celA, celB, etc., regardless of their actual mechanism of action on cellulose. Some researchers have attempted to convey more information by naming cellulases as endoglucanases (engA, engB) or cellobiohydrolases (cbhA, cbhB); however, this requires determination of function in vitro and still fails to convey relatedness of protein sequence and structure. CAZyme nomenclature retains the familiar acronym to indicate the functional system a gene belongs to and incorporates the family number designation. Capital letters after the family number indicate the order of report within a given organism system. An example is provided by two endoglucanases, CenA and CenB, of Cellulomonas fimi. In the old nomenclature nothing can be deduced from the names except order of discovery. Naming them Cel6A and Cel9A, respectively, makes it immediately clear that these two cellulases are unrelated in sequence, and so belong to different GH families (where Cel stands for cellulase, and 9 for glycosyl hydrolase family nine). While this scheme does not distinguish between endo- and exo-activity, these designations are not absolute and can be included in discussion of an enzyme when relevant (e.g., the cellobiohydrolase Cel6A, the endoxylanase Xyn10B). Catalytic modules take precedence in naming carbohydrases; since many (or even most) carbohydrases contain at least one CBM, they are named for their enzymatic module. If more than one catalytic domain is present, they are named in order from N-terminus to C-terminus, e.g., cel9A-cel48A contains a GH9 at the amino-terminus and a GH48 at the carboxy-terminus. Both domains act against cellulose. There are, however, many examples of CBM modules occurring on proteins with no predicted carbohydrase module. In the absence of some other predicted functional domain (like a protease) these proteins are named for the CBM module family. If there are multiple CBM families present, then naming is again from amino to carboxy end, e.g., cbm2D-cbm10A (Henrissat, Teeri et al. 1998). This nomenclature has been widely accepted and will be used in the naming of all 2-40 plant-wall active carbohydrases and related proteins considered as part of this study.
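As a rough illustration of how names built under this nomenclature can be read, the sketch below splits a name into its modules (N-terminus first) and extracts the acronym, family number and order letter of each. The Module type and the functions parse_cazyme_name and share_family are hypothetical helpers written for this example; the pattern covers only the name shapes quoted in this document.

```python
import re
from typing import List, NamedTuple


class Module(NamedTuple):
    """One module of a CAZyme-style name (hypothetical helper type)."""
    acronym: str   # functional system, e.g. "cel", "xyn", "cbm"
    family: int    # sequence-based family number
    order: str     # capital letter: order of report within the organism


# Acronym (letters, optionally joined by "/"), family number, one capital letter.
_MODULE_RE = re.compile(r"([A-Za-z/]+)(\d+)([A-Z])")


def parse_cazyme_name(name: str) -> List[Module]:
    """Split a name such as 'cel9A-cel48A' into modules, N-terminus first."""
    modules = []
    for part in name.split("-"):
        match = _MODULE_RE.fullmatch(part)
        if match is None:
            raise ValueError(f"unrecognized module name: {part!r}")
        acronym, family, order = match.groups()
        modules.append(Module(acronym, int(family), order))
    return modules


def share_family(a: str, b: str) -> bool:
    """True if two single-module names carry the same acronym and family number."""
    (ma,), (mb,) = parse_cazyme_name(a), parse_cazyme_name(b)
    return (ma.acronym.lower(), ma.family) == (mb.acronym.lower(), mb.family)


print(parse_cazyme_name("cel9A-cel48A"))
print(share_family("Cel6A", "Cel9A"))
```

The second call reflects the Cellulomonas fimi example: the two names show at a glance that the enzymes fall in different families.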

The cell walls of higher plants are composed of a variety of carbohydrate polymer (CP) components. These CP interact through covalent and non-covalent means, providing the structural integrity plants require to form rigid cell walls and resist turgor pressure. The major CP found in plants is cellulose, which forms the structural backbone of the cell wall. See FIG. 1A. During cellulose biosynthesis, chains of poly-β-1,4-D-glucose self-associate through hydrogen bonding and hydrophobic interactions to form cellulose microfibrils, which further self-associate to form larger fibrils. Cellulose microfibrils are somewhat irregular and contain regions of varying crystallinity. The degree of crystallinity of cellulose fibrils depends on how tightly ordered the hydrogen bonding is between its component cellulose chains. Areas with less-ordered bonding, and therefore more accessible glucose chains, are referred to as amorphous regions (FIG. 1B). The relative crystallinity and fibril diameter are characteristic of the biological source of the cellulose (Beguin and Aubert 1994; Tomme, Warren et al. 1995; Lynd, Weimer et al. 2002). The irregularity of cellulose fibrils results in a great variety of altered bond angles and steric effects which hinder enzymatic access and subsequent degradation.

The general model for cellulose depolymerization to glucose involves a minimum of three distinct enzymatic activities (See FIGS. 2A and 2B). Endoglucanases cleave cellulose chains internally to generate shorter chains and increase the number of accessible ends, which are acted upon by exoglucanases. These exoglucanases are specific for either reducing ends or non-reducing ends and frequently liberate cellobiose, the dimer of cellulose (such exoglucanases are termed cellobiohydrolases). The accumulating cellobiose is cleaved to glucose by cellobiases (β-1,4-glucosidases). In many systems an additional type of enzyme is present: cellodextrinases are β-1,4-glucosidases which cleave glucose monomers from cellulose oligomers, but not from cellobiose. Because of the variable crystallinity and structural complexity of cellulose, and the enzymatic activities required for its degradation, organisms with “complete” cellulase systems synthesize a variety of endo- and/or exo-acting β-1,4-glucanases.
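The three activities in this general model can be pictured with a toy bookkeeping sketch that counts glucose units. It is a deliberate simplification and an assumption of this example, not a kinetic model: chains are plain integers, every cut succeeds, crystallinity is ignored, and an odd unit left after cellobiose release is simply counted as free glucose. All function names are hypothetical.

```python
from typing import Iterable, List, Tuple


def endoglucanase(chain_len: int, cut_sites: Iterable[int]) -> List[int]:
    """Cut one chain of `chain_len` glucose units at internal bond positions."""
    sites = sorted(set(cut_sites))
    if any(s <= 0 or s >= chain_len for s in sites):
        raise ValueError("cut sites must be internal to the chain")
    bounds = [0, *sites, chain_len]
    return [end - start for start, end in zip(bounds, bounds[1:])]


def cellobiohydrolase(fragment_len: int) -> Tuple[int, int]:
    """Release cellobiose from a fragment end: (cellobiose units, leftover glucose)."""
    return divmod(fragment_len, 2)


def beta_glucosidase(n_cellobiose: int) -> int:
    """Cleave each cellobiose into two glucose molecules."""
    return 2 * n_cellobiose


def saccharify(chain_len: int, cut_sites: Iterable[int]) -> int:
    """Total glucose released when the three activities act in sequence."""
    glucose = 0
    for fragment in endoglucanase(chain_len, cut_sites):
        n_cellobiose, leftover = cellobiohydrolase(fragment)
        glucose += beta_glucosidase(n_cellobiose) + leftover
    return glucose


print(endoglucanase(10, [4]))
print(saccharify(11, [4]))
```

The only property the sketch demonstrates is conservation: every glucose unit in the starting chain is recovered as free glucose once all three activities have acted.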

For example, Cellulomonas fimi and Thermomonospora fusca have each been shown to synthesize six cellulases, while Clostridium thermocellum has as many as 15 or more (Tomme, Warren et al. 1995). Presumably, the variations in the shape of the substrate-binding pockets and/or active sites of these numerous cellulases facilitate complete cellulose degradation (Warren 1996). Organisms with complete cellulase systems are believed to be capable of efficiently using plant biomass as a carbon and energy source while mediating cellulose degradation. The ecological and evolutionary role of incomplete cellulase systems is less clear, although it is believed that many of these function as members of consortia (such as ruminal communities) which may collectively achieve total or near-total cellulose hydrolysis (Ljungdahl and Eriksson 1985; Tomme, Warren et al. 1995).

In the plant cell wall, microfibrils of cellulose are embedded in a matrix of hemicelluloses (including xylans, arabinans and mannans), pectins (galacturonans and galactans), and various β-1,3 and β-1,4 glucans. These matrix polymers are often substituted with arabinose, galactose and/or xylose residues, yielding arabinoxylans, galactomannans and xyloglucans, to name a few (Tomme, Warren et al. 1995; Warren 1996; Kosugi, Murashima et al. 2002; Lynd, Weimer et al. 2002). The complexity and sheer number of different glycosyl bonds presented by these non-cellulosic CP require specific enzyme systems which often rival cellulase systems in enzyme count and complexity. Because of its heterogeneity, plant cell wall degradation often requires consortia of microorganisms (Ljungdahl and Eriksson 1985; Tomme, Warren et al. 1995).

Objectives: S. degradans and M. degradans synthesize complete multi-enzyme systems that degrade the major structural polymers of plant cell walls. The objectives are to A) define the cellulase and xylanase systems, determining the activities of genes for which function cannot be predicted by sequence homology; and B) identify and annotate other plant-degrading enzyme systems by sequence homology (e.g., pectinases, laminarinases, etc.).

All publications, patents and patent applications are herein expressly incorporated by reference in their entirety to the same extent as if each individual publication, patent or patent application was specifically and individually indicated to be incorporated by reference in its entirety.

The following examples illustrate but are not intended in any way to limit the scope of the invention.

Experimental Results

I: Genomic, proteomic and functional analyses of 2-40 plant-wall active enzymes

From the ORNL annotation it is clear that the 2-40 genome contains numerous enzymes with predicted activity against plant cell wall polymers. This is particularly surprising since 2-40 is an estuarine bacterium with several complex enzyme systems that degrade common marine polysaccharides such as agar, alginate, and chitin. Defining multienzyme systems based on automated annotations is complicated by the presence of poorly conserved domains and/or novel combinations of domains. There are many examples of this in the plant-wall active enzymes of 2-40. Accordingly, the ORNL annotations of carbohydrase ORFs were manually reviewed with emphasis on the modular composition and then assigned to general groups based on the substrate they were likely to be involved with (e.g., cellulose or xylan degradation). These genomic sequence analyses resulted in a pool of about 25 potential cellulases, 11 xylanases and 17 pectinases.

When sequence homology is well-conserved, highly accurate predictions of function are possible. Therefore, to verify the presence of functioning cellulase and xylanase systems in M. degradans, zymograms and enzyme activity assays were performed as discussed below. Also, attempts were made to identify enzymes from 2-40 culture supernatants using mass spectrometry based proteomics.

Next, more sophisticated genomic analyses were used to predict function where possible and to identify ORFs which require functional characterization to determine their roles, if any, in the cellulase and xylanase systems. ORFs which belong to other plant wall-active enzyme systems were tentatively classified based on the sequence analyses and functional predictions of B. Henrissat.

To gain insight into the induction and expression of 2-40 cellulases and xylanases, specific activities were determined for avicel- and xylan-grown cells and supernatants by dinitrosalicylic acid reducing-sugar assays (DNSA assays), as discussed in the Experimental Protocols section at the end of this proposal. Xylanase activity was measured for avicel-grown cultures, and vice versa, in order to investigate possible co-induction of activity by these two substrates, which occur together in the plant cell wall.

Growth on either avicel or xylan yields enzymatic activity against both substrates, suggesting co-induction of the cellulase and xylanase systems. As with other 2-40 carbohydrase systems, the highest levels of activity were induced by the homologous substrate. The results also reveal some key differences in the expression of these two systems. When grown on avicel, cellulase activity is cell-associated in early growth and accumulates significantly in late-stage supernatants. Cell and supernatant fractions exhibit low levels of xylanase activity that remain roughly equal throughout all growth phases. In contrast, xylan-grown cultures exhibit the majority of xylanase and cellulase activity in the cellular fraction throughout the growth cycle. Cellulase activity does not accumulate in the supernatant, and xylanase activity accumulates modestly but still remains below the cell-bound activity.

Enzyme activity gels (zymograms) of avicel- and xylan-grown cell pellets and culture supernatants were analyzed to visualize and identify expressed cellulases and xylanases. The zymograms revealed five xylanolytic bands in xylan-grown supernatants (FIG. 3), four of which correspond well with the calculated MW of predicted xylanases (xyl/arb43G-xyn10D: 129.6 kDa, xyn10E: 75.2 kDa, xyn10C: 42.3 kDa, and xyn11A: 30.4 kDa; see Table 2). Avicel-grown cultures showed eight active bands with MWs ranging from 30-150 kDa in CMC zymograms. CMC is generally a suitable substrate for endocellulase activity. These zymograms clearly demonstrate that 2-40 synthesizes a number of endocellulases of varied size during growth on avicel, which is indicative of a functioning multienzyme cellulase system. Together, the CMC and xylan zymograms confirm the results of the genomic analyses and the inducible expression of multienzyme cellulase and xylanase systems in M. degradans 2-40.

To identify individual cellulases and xylanases produced during growth on CP, culture supernatants were subjected to proteomic analysis using reversed-phase high-performance liquid chromatography (RP-HPLC) coupled with tandem mass spectrometry (MS/MS). The power resulting from separating the peptides on the RP-HPLC column prior to electrospray ionization and MS/MS analysis allows the identification of a great number of proteins from complex samples (Smith, Loo et al. 1990; Shevchenko, Wilm et al. 1996; Jonsson, Aissouni et al. 2001). These analyses confidently identified over 100 different non-enzymatic proteins and a number of carbohydrases, including a xylanase, two xylosidases, a cellulase, and two cellodextrinases. An agarase was identified during additional analyses of agarose-grown supernatant.

Gel-slice digestion, extraction, and MS/MS analyses performed at the Stanford University Mass Spectrometry facility identified two annotated cellulases from an avicel-grown supernatant sample. One, designated cel5H, has a predicted MW of 67 kDa and was identified from a band with an apparent MW of 75 kDa. The other, cel9B, has a predicted MW of 89 kDa, but an apparent MW of 120 kDa. The discrepancy between the predicted and apparent MW of cel9B is consistent with similar instances where certain 2-40 proteins, cloned and expressed in E. coli, exhibit apparent MWs which are 30-40% higher than their predicted MW.
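The size of the discrepancy can be checked with simple arithmetic. The following is a minimal sketch that uses only the predicted and apparent MWs stated above; the function name is illustrative and not part of the original work.

```python
def percent_above_predicted(predicted_kda, apparent_kda):
    """Return how far the apparent MW exceeds the predicted MW, in percent."""
    return (apparent_kda - predicted_kda) / predicted_kda * 100.0

# (predicted MW, apparent MW) in kDa, as reported for the two gel-slice hits
bands = {"cel5H": (67.0, 75.0), "cel9B": (89.0, 120.0)}

for name, (predicted, apparent) in bands.items():
    excess = percent_above_predicted(predicted, apparent)
    print(f"{name}: apparent MW is {excess:.1f}% above predicted")
```

On these figures cel9B falls inside the 30-40% range mentioned above, while cel5H shows a smaller shift.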

The amino acid translations of all gene models in the 2-40 draft genome were analyzed on the CAZy ModO (Carbohydrate-Active enZyme Modular Organization) server at AFMB-CNRS. This analysis identified all gene models that contain a catalytic module (GH, GT, PL, or CE) and/or a CBM. In all, the genome contains 222 gene models containing CAZy domains, most of which have modular architecture. Of these, 117 contain a GH module, 39 have GTs, 29 have PLs, and 17 have CEs. Many of these carry one or more CBMs from various families. There are also 20 proteins that contain a CBM but no predicted carbohydrase domain.
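The module counts above can be reconciled with the stated total. The sketch below assumes that each gene model is counted once under a single catalytic class, which the text does not state explicitly; under that assumption the catalytic gene models plus the CBM-only proteins account for all 222 gene models.

```python
# Counts of gene models per catalytic module class, as reported in the text
catalytic_counts = {"GH": 117, "GT": 39, "PL": 29, "CE": 17}
cbm_only = 20  # proteins with a CBM but no predicted carbohydrase domain

catalytic_total = sum(catalytic_counts.values())
grand_total = catalytic_total + cbm_only

print(f"catalytic gene models: {catalytic_total}")
print(f"catalytic + CBM-only:  {grand_total}")
```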

Detailed comparisons of 2-40 module sequences to those in the ModO database allowed specific predictions of function for modules where the sequence of the active site is highly conserved. For example, Cel9B (from the gel slice MS/MS) contains a GH9 module, which is predicted to function as an endocellulase, a CBM2 module and a CBM10 module.

When catalytic module sequences are less conserved, only a general mechanism can be predicted. This is the case with gly5M, which contains a GH5 predicted to be either a 1,3- or a 1,4-glucanase; sequence analysis cannot determine which, hence the designation “gly” for glycanase.

The results of this detailed evaluation and analysis were used to assign genes to cellulase, xylanase, pectinase, laminarinase, arabinanase and mannanase systems. Each system was also assigned the relevant accessory enzymes, i.e., cellobiases belong to the cellulase system and xylosidases belong to the xylanase system. Genes with less-conserved GH modules which have the most potential to function as cellulases, xylanases or accessories were identified and designated as needing demonstration of function.

The results of the ORNL annotation, follow-up annotation analyses, proteomic (mass spectrometry) analyses, CAZyme modular analyses and functional predictions have been incorporated into FIGS. 4-11, which contain tables that summarize the predicted plant wall active carbohydrases and selected CBM-only genes of 2-40.

The genes chosen for cloning and functional analysis include the carbohydrases gly3C, gly5K, gly5M, gly9C, and gly43M. Because the active site of gly5L is highly homologous to that of gly5K, its activity is inferred from the results obtained from gly5K. Four of the 20 “CBM only” proteins, cbm2A, cbm2B, cbm2C and cbm2D-cbm10A, are included in activity assays to investigate their predicted lack of enzymatic function. These four contain CBM2 modules that are predicted to bind to crystalline cellulose. This predicted affinity is the reason for their inclusion in activity assays; those proteins that bind to cellulose are most likely to contain cellulase or xylanase modules which were not detected by sequence analysis. With CBM-only proteins, a lack of detected enzyme activity will confirm the absence of a catalytic domain (CD).

In order to define the complete cellulase and xylanase systems of M. degradans, those enzymes which may belong to the systems but cannot be confidently assigned based on sequence homology will be expressed, purified and assayed for activity as described in the Experimental Protocols. To date, gly3C, gly5K, gly5M, gly9C and gly43M, as well as cbm2A, cbm2B, cbm2C and cbm2D-cbm10A, have been cloned into expression strains as pETBlue2 (Novagen) constructs. This vector places expression under the control of an inducible T7 lac promoter and incorporates a C-terminal 6× Histidine tag, allowing purification of the recombinant protein by nickel ion affinity. Successful cloning and expression of these proteins was confirmed by western blots using α-HisTag® monoclonal antibody (Novagen). All expressed proteins have apparent MWs which are close to, or larger than, their predicted MW (Table 8) except for Cbm2D-Cbm10A, which appears to be unstable; two separate attempts to clone and express this protein have resulted in HisTag®-containing bands which occur near the dye front in western blots, suggesting proteolytic degradation of this gene product. An additional enzyme, Cel5A, has been cloned and expressed for use as an endocellulase positive control in activity assays. Cel5A has a predicted MW of 129 kDa, contains two GH5 modules, and is highly active in HE-cellulose zymograms.

The major criteria for assigning function will be the substrate acted upon and the type of activity detected. As such, the various enzyme activity assays will focus on providing a qualitative demonstration of function rather than on rigorously quantifying relative activity levels. The assays required are dictated by the substrate being tested, and are discussed in more detail in Experimental Protocols. For cellulose it is important to distinguish between β-1,4-endoglucanase (endocellulase), β-1,4-exoglucanase (cellobiohydrolase), and β-1,4-glucosidase (cellobiase) activities. This will be accomplished using zymograms to assay for endocellulase, DNSA reducing-sugar assays for cellobiohydrolase, and p-nitrophenol-β-1,4-cellobioside (pnp-cellobiose) for cellobiase activity. The combined results from all three assays will allow definition of function as follows: a positive zymogram indicates endocellulase activity, a negative zymogram combined with a positive DNSA assay and a negative pnp-cellobiose assay indicates an exocellulase, while a negative zymogram and DNSA with a positive pnp-cellobiose result will imply that the enzyme is a cellobiase.
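The three-assay decision rule just described can be written out as a small function. This is an illustrative sketch only: the function name and the "unresolved" fallback for result combinations not covered by the text are assumptions, not part of the described protocol.

```python
def classify_cellulase(zymogram, dnsa, pnp_cellobiose):
    """Assign a cellulase function from three qualitative (True/False) assay results."""
    if zymogram:
        # A positive zymogram indicates endocellulase activity
        return "endocellulase"
    if dnsa and not pnp_cellobiose:
        # Negative zymogram, positive DNSA, negative pnp-cellobiose
        return "exocellulase"
    if not dnsa and pnp_cellobiose:
        # Negative zymogram and DNSA, positive pnp-cellobiose
        return "cellobiase"
    # Combinations the text does not address (assumed label)
    return "unresolved"

print(classify_cellulase(zymogram=False, dnsa=True, pnp_cellobiose=False))
```

Because the assays are read qualitatively, boolean inputs are sufficient; no activity thresholds are implied.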

Xylanase (β-1,4-xylanase), laminarinase (β-1,3-glucanase), and mixed glucanase (β-1,3(4)-glucanase) activity will be determined by xylan, laminarin and barley glucan zymograms, respectively. Unlike cellulose, there do not appear to be any reports of “xylobiohydrolases” or other exo-acting enzymes which specifically cleave dimers from these substrates. Thus zymograms will suffice for demonstrating depolymerase (endo) activity and pnp-derivatives will detect monosaccharide (exo) cleavage. The pnp-derivatives used in this study will include pnp-α-L-arabinofuranoside, -α-L-arabinopyranoside, -β-L-arabinopyranoside, -β-D-cellobioside, -α-D-xylopyranoside and -β-D-xylopyranoside. These substrates were chosen based on the possible activities of the domains in question. The assays will allow determination of function for any α- and β-arabinosidases, β-cellobiases, β-xylosidases, bifunctional α-arabinosidase/β-xylosidases, and α-xylosidases, which cleave α-linked xylose substituents from xyloglucans. The pnp-derivative assays will be run in 96-well microtiter plates using a standard curve of p-nitrophenol concentrations, as discussed in Experimental Protocols.

The combination of assays for β-1,4-, β-1,3-, and β-1,3(4)-glucanase activities, as well as for β-1,4-xylanase and the various exo-glycosidase activities, should clearly resolve the function of the ambiguous carbohydrases. Proteins with demonstrated activity will be assigned to the appropriate enzyme system.

Experimental Protocols

Zymograms

All activity gels were prepared as standard SDS-PAGE gels with the appropriate CP substrate incorporated directly into the separating gel. Zymograms are cast with 8% polyacrylamide concentration and the substrate dissolved in dH₂O and/or gel buffer solution to give a final concentration of 0.1% (HE-cellulose), 0.15% (barley β-glucan), or 0.2% (xylan). Gels are run under discontinuous conditions according to the procedure of Laemmli (Laemmli 1970) with the exception of an 8 minute treatment at 95° C. in sample buffer containing a final concentration of 2% SDS and 100 mM dithiothreitol (DTT). After electrophoresis, gels are incubated at room temperature for 1 hour in 80 ml of a renaturing buffer of 20 mM PIPES buffer pH 6.8 which contains 2.5% Triton X-100, 2 mM DTT and 2.5 mM CaCl₂. The calcium was included to assist the refolding of potential calcium-binding domains such as the tsp3s of Lam16A.

After the 1 hour equilibration, gels were placed in a fresh 80 ml portion of renaturing buffer and held overnight at 4° C. with gentle rocking. The next morning gels were equilibrated in 80 ml of 20 mM PIPES pH 6.8 for 1 hour at room temperature, transferred to a clean container, covered with the minimal amount of PIPES buffer and incubated at 37° C. for 4 hours. Following incubation, gels were stained for 30 minutes with a solution of either 0.25% Congo red in dH₂O (HE-cellulose, β-glucan and xylan) or 0.01% Toluidine blue in 7% acetic acid. Gels were destained with 1M NaCl for Congo red and dH₂O for Toluidine blue until clear bands were visible against a stained background.

Nelson-Somogyi Reducing-Sugar Assays

Purified proteins were assayed for activity using a modification of the Nelson-Somogyi reducing sugar method adapted for 96-well microtiter plates, using 50 μl reaction volumes (Green, Clausen et al. 1989). Test substrates included avicel, CMC, phosphoric-acid swollen cellulose (PASC), barley glucan, laminarin, and xylan dissolved at 1% in 20 mM PIPES pH 6.8 (barley glucan and laminarin, 0.5%). Barley glucan, laminarin and xylan assays were incubated 2 hours at 37° C.; avicel, CMC and PASC assays were incubated 36 hours at 37° C. Samples were assayed in triplicate, corrected for blank values, and levels estimated from a standard curve. Protein concentration of enzyme assay samples was measured in triplicate using the Pierce BCA protein assay according to the manufacturer's instructions. Enzymatic activity was calculated, with one unit (U) defined as 1 μM of reducing sugar released/minute and reported as specific activity in U/mg protein.
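The specific-activity calculation can be sketched as follows. This is a hedged illustration, not the calculation actually used: it assumes one unit corresponds to 1 μmol of reducing sugar released per minute (the conventional reading of the unit definition above), and every numerical input is hypothetical.

```python
from statistics import mean

def specific_activity(sugar_umol_replicates, blank_umol, minutes, protein_mg):
    """Specific activity in U/mg, assuming 1 U = 1 umol reducing sugar per minute."""
    net_umol = mean(sugar_umol_replicates) - blank_umol  # blank-corrected triplicate mean
    units = net_umol / minutes                           # umol released per minute
    return units / protein_mg

# Hypothetical triplicate from a 2 hour (120 min) xylan assay with 5 ug of enzyme
activity = specific_activity([0.62, 0.60, 0.58], blank_umol=0.06,
                             minutes=120, protein_mg=0.005)
print(f"{activity:.2f} U/mg")
```

The reducing-sugar amounts passed in are taken to be values already read from the standard curve.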

Exoglycosidase Activity Assays: Pnp-Derivatives

Purified proteins were assayed for activity against pNp derivatives of α-L-arabinofuranoside, -α-L-arabinopyranoside, -β-L-arabinopyranoside, -β-D-cellobioside, -α-D-glucopyranoside, -β-D-glucopyranoside, -α-D-xylopyranoside and -β-D-xylopyranoside. 25 μl of enzyme solution was added to 125 μl of 5 mM substrate solution in 20 mM PIPES pH 6.8, incubated for 30 min at 37° C., and A₄₀₅ was determined. After correcting for blank reactions, readings were compared to a p-nitrophenol standard curve and reported as specific activities in U/mg protein, with one unit (U) defined as 1 μmol p-Np/min.
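Converting a blank-corrected A₄₀₅ reading into specific activity through a p-nitrophenol standard curve can be sketched as below. The standard-curve points, the sample readings, and the function names are all hypothetical; a plain least-squares line is assumed for the curve, which the text does not specify.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def pnp_specific_activity(a405, blank_a405, slope, intercept, minutes, protein_mg):
    """Specific activity in U/mg, with 1 U = 1 umol p-Np released per minute."""
    nmol = (a405 - blank_a405 - intercept) / slope  # read p-Np amount off the curve
    return (nmol / 1000.0) / minutes / protein_mg   # convert nmol to umol

# Hypothetical standard curve: nmol p-nitrophenol per well versus A405
slope, intercept = fit_line([0, 10, 20, 40], [0.0, 0.1, 0.2, 0.4])

# Hypothetical sample: 30 min incubation with 2 ug of enzyme
result = pnp_specific_activity(0.35, 0.05, slope, intercept,
                               minutes=30, protein_mg=0.002)
print(f"{result:.2f} U/mg")
```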

Mass Spectrometry and Proteomic Analyses

Stationary-phase supernatants from avicel-, CMC-, and xylan-grown cultures were concentrated to ˜25× by centrifugal ultrafiltration using microcon or centricon devices (Millipore). Sample protein concentrations were determined by the BCA protein assay. Samples were exchanged into 100 mM Tris buffer, pH 8.5, which also contained 8M urea and 10 mM DTT. Samples were incubated 2 hours at 37° C. with shaking to denature the proteins and reduce disulfide bonds. After reduction, 1M iodoacetate was added to a final concentration of 50 mM and the reaction was incubated 30 minutes at 25° C. in the dark. This step alkylates the reduced cysteine residues, thereby preventing reformation of disulfide bonds. The samples are then exchanged into 50 mM Tris, 1 mM CaCl₂, pH 8.5 using microcon devices. The denatured, reduced, and alkylated sample is digested into peptide fragments using proteomics-grade trypsin (Promega) at a 1:50 enzyme (trypsin) to substrate (supernatant) ratio. Typical digestion reactions were around 150 μl total volume. Digestions were incubated overnight at 37° C., stopped by addition of 99% formic acid to a final concentration of ˜1% and analyzed by RP-HPLC-MS/MS at the UMCP College of Life Sciences CORE Mass Spectrometry facility.
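The reagent amounts in this sample preparation follow from simple dilution arithmetic. The sketch below is illustrative: the sample volume and protein mass are hypothetical, and the 1:50 trypsin ratio is assumed to be by mass (w/w), which the text does not state.

```python
def stock_volume_for_final(sample_ul, stock_mm, final_mm):
    """Volume of stock (ul) to add so that the mixture reaches final_mm.

    Solves stock_mm * v = final_mm * (sample_ul + v) for v, so the added
    volume itself is accounted for.
    """
    return final_mm * sample_ul / (stock_mm - final_mm)

def trypsin_ug(protein_ug, substrate_per_enzyme=50.0):
    """Trypsin mass for a 1:50 enzyme-to-substrate ratio (assumed w/w)."""
    return protein_ug / substrate_per_enzyme

# Hypothetical 100 ul reduced sample containing 100 ug of protein
iodoacetate_ul = stock_volume_for_final(100.0, stock_mm=1000.0, final_mm=50.0)
print(f"add {iodoacetate_ul:.2f} ul of 1 M iodoacetate")
print(f"add {trypsin_ug(100.0):.1f} ug of trypsin")
```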

Peptide fragments were loaded onto a Waters 2960 HPLC fitted with a 12 cm microbore column containing C₁₈ as the adsorbent and eluted with a linear gradient of increasing acetonitrile (CH₃CN) concentration into an electrospray ionization apparatus. The electrospray apparatus ionized and injected the peptides into a Finnigan LCQ tandem mass spectrometer. Automated operating software controlled the solvent gradient and continually scanned the eluted peptides. The program identifies each of the three most abundant ion species in a survey scan, isolates each of them in the mass spectrometer's ion trap and fragments them by inducing collisions with helium molecules. The resulting sub-fragment masses are recorded for further analysis by peptide analysis packages like SEQUEST and MASCOT. After the three subscan and collision cycles have completed, the MS takes another survey scan and the cycle repeats until the end of the run, usually about three hours. The raw MS reads are used by the analysis software to generate peptide fragment sequences, which were compared to amino acid sequence translations of all gene models in the 2-40 draft genome. Peptide identity matches were evaluated using accepted thresholds of statistical significance which are specific for each program.
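The precursor selection step, in which the three most abundant ions of each survey scan are chosen for fragmentation, can be illustrated with a short sketch. The peak list is hypothetical and the function is not the instrument's control software; it only shows the selection logic described above.

```python
def select_precursors(survey_scan, n=3):
    """Return the n most intense (m/z, intensity) peaks from a survey scan."""
    return sorted(survey_scan, key=lambda peak: peak[1], reverse=True)[:n]

# Hypothetical survey scan: (m/z, intensity) pairs
survey = [(400.2, 1.0e4), (512.3, 5.0e5), (623.8, 2.0e5),
          (745.1, 9.0e4), (830.4, 3.0e3)]

for mz, intensity in select_precursors(survey):
    print(f"isolate and fragment m/z {mz} (intensity {intensity:.0f})")
```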

Cloning and Expression of 2-40 Proteins in E. coli

The basic cloning and expression system uses pETBlue2 (Novagen) as the vector, E. coli DH5α (Invitrogen) as the cloning strain, and E. coli BL-21(DE3) Tuner® cells (Novagen) as the protein expression strain. This system allows the cloning of toxic or otherwise difficult genes because the vector places expression under the control of a T7 lac promoter, which is lacking in the cloning strain DH5α, thereby abolishing even low-level expression during plasmid screening and propagation. After the blue/white screen, plasmids are purified from DH5α and transformed into the expression host (Tuners). The Tuner strain has the T7 lac promoter, allowing IPTG-inducible expression of the vector-coded protein, and lacks the Lon and Omp proteases.

The nucleotide sequences of gene models were obtained from the DOE JGI's Microbulbifer degradans genome web server and entered into the PrimerQuest™ design tool provided on the Integrated DNA Technologies web page. The design parameters were Optimum T_(m) 60° C., Optimum Primer Size 20 nt, Optimum GC %=50, and the product size ranges were chosen so that the primers were selected within the first and last 100 nucleotides of each ORF in order to clone as much of the gene as reasonably possible. The cloning and expression vector, pETBlue2, provides a C-terminal 6× Histidine fusion as well as the start and stop codon for protein expression. Thus, careful attention to the frame of the vector and insert sequences is required when adding 5′ restriction sites to the PCR primers. The resulting “tailed primers” were between 26 and 30 nt long, and their sequences were verified by “virtual cloning” analysis using the PDRAW software package. This program allows vector and insert DNA sequences to be cut with standard restriction enzymes and ligated together. The amino acid translations of the resulting sequences were examined to detect any frame shifts introduced by errors in primer design. Following this verification, the primers were purchased from Invitrogen (Frederick, Md.).

PCR reactions contained 10 pmol of forward and reverse primers, 1 μl of 10 mM dNTPs, 1.5 μl of 100 mM MgCl₂, and 1 μl Proof Pro® Pfu Polymerase in a 50 μl reaction with 0.5 μl of 2-40 genomic DNA as the template. PCR conditions used standard parameters for tailed primers and Pfu DNA polymerase. PCR products were cleaned up with the QIAGEN QIAquick PCR Cleanup kit and viewed in 0.8% agarose gels. Following cleanup and confirmation of size, PCR products and pETBlue2 are digested with appropriate restriction enzymes, usually AscI and ClaI, at 37° C. for 1 to 4 hours, cleaned up using the QIAquick kit, and visualized in agarose gels. Clean digestions are ligated using T4 DNA ligase for at least 2 hours in the dark at room temperature. Ligations are then transformed into E. coli DH5α by electroporation. Transformants are incubated one hour at 37° C. in non-selective media, and then plated onto LB agar containing ampicillin and X-gal. As pETBlue2 carries an Amp^(r) gene and inserts are cloned into the lacZ ORF, white colonies contain the insert sequence. White colonies are picked with toothpicks and patched onto a new LB/Amp/X-gal plate, with three of the patched colonies also being used to inoculate 3 ml overnight broths. Plasmids are prepped from broths which correspond to patched colonies which remained white after overnight outgrowth. These plasmid preps are then singly digested with an appropriate restriction enzyme and visualized by agarose electrophoresis for size confirmation.

The plasmids are then heat-shock transformed into the Tuner® strain, which carries a chromosomal chloramphenicol resistance gene (Cm^(r)). The transformants are incubated 1 hour at 37° C. in non-selective rescue medium, plated on LB agar with Amp and Cm (Tuner medium) and incubated overnight at 37° C. Any colonies thus selected should contain the vector and insert. This is confirmed by patching three colonies onto a Tuner medium plate and inoculating corresponding 3 ml overnight broths. The next morning the broths are used to inoculate 25 ml broths which are grown to an OD₆₀₀ of around 0.6 (2-3 hours). At this point a 1 ml aliquot is removed from the culture, pelleted and resuspended in 1/10 volume 1×SDS-PAGE treatment buffer. This pre-induced sample is frozen at −20° C. for later use in western blots. The remaining broth is then amended to 1 mM IPTG and incubated 4 hours at 37° C. Induced pellet samples are collected at hourly intervals. These samples and the pre-induced control are run in standard SDS-PAGE gels and electroblotted onto PVDF membrane. The membranes are then processed as western blots using a 1/5000 dilution of monoclonal mouse α-HisTag® primary antibodies followed by HRP-conjugated goat α-mouse IgG secondary antibodies. Bands are visualized colorimetrically using BioRad's Opti-4CN substrate kit. Presence of His-tagged bands in the induced samples, but not in uninduced controls, confirms successful expression, and comparison of bands from the hourly time points is used to optimize induction parameters in later, larger-scale purifications.

Production and Purification of Recombinant Proteins

Expression strains are grown to an OD₆₀₀ of 0.6 to 0.8 in 500 ml or 1 liter broths of Tuner medium. At this point a non-induced sample is collected and the remaining culture is induced by addition of 100 mM IPTG to a final concentration of 1 mM. Induction is carried out for four hours at 37° C. or for 16 hours at 25° C. Culture pellets are harvested and frozen overnight at −20° C. for storage and to aid cell lysis. Pellets are then thawed on ice for 10 minutes, transferred to pre-weighed falcon tubes and weighed. The cells are then rocked for 1 hour at 25° C. in 4 ml of lysis buffer (8M urea, 100 mM NaH₂PO₄, 25 mM Tris, pH 8.0) per gram wet pellet weight. The lysates are centrifuged for 30 minutes at 15,000 g to pellet cell debris. The cleared lysate (supernatant) is pipetted into a clean falcon tube, where 1 ml of QIAGEN 50% Nickel-NTA resin is added for each 4 ml of cleared lysate. This mixture is gently agitated for 1 hour at room temperature to facilitate binding between the Ni²⁺ ions on the resin and the His tags of the recombinant protein. After binding, the slurry is loaded into a disposable mini column and the flow-through (depleted lysate) is collected and saved for later evaluation. The resin is washed twice with lysis buffer that has been adjusted to pH 7.0; the volume of each of these washes is equal to the original volume of cleared lysate. The flow-through of these two washes is also saved for later analysis in western blots to evaluate purification efficiency.
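The volumes in this procedure scale with culture volume and pellet weight. The following sketch works through that arithmetic; the function names and the example pellet weight are hypothetical, and the IPTG calculation accounts for the volume of stock added, a refinement the text does not mention.

```python
def iptg_stock_ml(culture_ml, stock_mm=100.0, final_mm=1.0):
    """Volume of IPTG stock (ml) giving final_mm after addition to the culture."""
    # Solves stock_mm * v = final_mm * (culture_ml + v) for v
    return final_mm * culture_ml / (stock_mm - final_mm)

def lysis_buffer_ml(wet_pellet_g, ml_per_gram=4.0):
    """Lysis buffer volume at 4 ml per gram of wet pellet."""
    return wet_pellet_g * ml_per_gram

def resin_slurry_ml(cleared_lysate_ml, lysate_ml_per_ml_resin=4.0):
    """50% Ni-NTA slurry volume at 1 ml per 4 ml of cleared lysate."""
    return cleared_lysate_ml / lysate_ml_per_ml_resin

# Hypothetical 500 ml culture yielding a 2.5 g wet pellet
print(f"IPTG stock:   {iptg_stock_ml(500.0):.2f} ml")
print(f"lysis buffer: {lysis_buffer_ml(2.5):.1f} ml")
print(f"resin slurry: {resin_slurry_ml(lysis_buffer_ml(2.5)):.1f} ml")
```

The resin figure treats the cleared lysate volume as equal to the lysis buffer volume, which ignores the small volume lost with the debris pellet.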

At this point the columns contain relatively purified recombinant proteins which are immobilized by the His tags at their C-terminus. This is an ideal situation for refolding, so the column is moved to a 4° C. room and a series of renaturation buffers with decreasing urea concentrations is passed through the column. The renaturation buffers contain varying amounts of urea in 25 mM Tris pH 7.4, 500 mM NaCl, and 20% glycerol. This buffer is prepared as stock solutions containing 6M, 4M, 2M and 1M urea. Aliquots of these can be easily mixed to obtain 5M and 3M urea concentrations, thus providing a descending series of urea concentrations in 1M steps. One volume (the original lysate volume) of 6M buffer is passed through the column, followed by one volume of 5M buffer, continuing on to the 1M buffer, which is repeated once to ensure equilibration of the column at 1M urea. At this point the refolded proteins are eluted in 8 fractions of 1/10^(th) original volume using 1M urea, 25 mM Tris pH 7.4, 500 mM NaCl, 20% glycerol containing 250 mM imidazole. The imidazole disrupts the nickel ion-His tag interaction, thereby releasing the protein from the column.
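The descending urea series can be laid out explicitly. This sketch assumes the 5M and 3M buffers are made by mixing equal volumes of the adjacent stocks (6M with 4M, and 4M with 2M); the text says only that aliquots are mixed, so the equal-volume scheme and the example lysate volume are assumptions.

```python
def mix_equal_volumes(conc_a, conc_b):
    """Concentration obtained by mixing equal volumes of two buffers."""
    return (conc_a + conc_b) / 2.0

# Stock renaturation buffers (M urea), as listed in the text
stock_6m, stock_4m, stock_2m, stock_1m = 6.0, 4.0, 2.0, 1.0

wash_series = [
    stock_6m,
    mix_equal_volumes(stock_6m, stock_4m),  # 5 M (assumed equal-volume mix)
    stock_4m,
    mix_equal_volumes(stock_4m, stock_2m),  # 3 M (assumed equal-volume mix)
    stock_2m,
    stock_1m,
    stock_1m,  # 1 M step repeated once to equilibrate the column
]

lysate_ml = 10.0                 # hypothetical original lysate volume
fraction_ml = lysate_ml / 10.0   # each of the 8 elution fractions

for urea in wash_series:
    print(f"pass {lysate_ml:.0f} ml of {urea:.0f} M urea buffer")
print(f"elute 8 fractions of {fraction_ml:.1f} ml each")
```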

Western blots are used to evaluate the amount of His-tagged protein in the depleted lysate, the two washes, and the eluted fractions. If there is an abundance of recombinant protein in the depleted lysate and/or washes, it is possible to repeat the process and “scavenge” more protein. Eluate fractions that contain the protein of interest are pooled and then concentrated and exchanged into storage buffer (20 mM Tris pH 7.4, 10 mM NaCl, 10% glycerol) using centricon centrifugal ultrafiltration devices (Millipore). The enzyme preparations are then aliquoted and frozen at −80° C. for use in activity assays.

In various embodiments of this invention, the cellulose degrading enzymes, related proteins and systems containing them, for example including one or more enzymes or cellulose-binding proteins, have a number of uses. Many possible uses of the cellulases of the present invention are the same as described for other cellulases in the paper “Cellulases and related enzymes in biotechnology” by M. K. Bhat (Biotechnology Advances 18 (2000) 355-383), the subject matter of which is hereby incorporated by reference in its entirety. For example, the cellulases and systems thereof of this invention can be utilized in food, beer, wine, animal feeds, textile production and laundering, the pulp and paper industry, and agricultural industries.

In one embodiment, these systems can be used to degrade cellulose to produce short chain peptides for use in medicine.

In other embodiments, these systems are used to break down cellulose in the extraction and/or clarification of fruit and vegetable juices, in the production and preservation of fruit nectars and purees, in altering the texture, flavor and other sensory properties of food, in the extraction of olive oil, in improving the quality of bakery products, in brewing beer and making wine, in preparing monogastric and ruminant feeds, in textile and laundry technologies including “fading” denim material, defibrillation of lyocell, washing garments and the like, in preparing paper and pulp products, and in agricultural uses.

In some embodiments of this invention, cellulose may be used to absorb environmental pollutants and waste spills. The cellulose may then be degraded by the cellulose degrading systems of the present invention. Bacteria that can metabolize environmental pollutants and can degrade cellulose may be used in bioreactors that degrade toxic materials. Such a bioreactor would be advantageous since there would be no need to add additional nutrients to maintain the bacteria; they would use cellulose as a carbon source.

In some embodiments of this invention, cellulose degrading enzyme systems can be supplied in dry form, in buffers, as pastes, paints, micelles, etc. Cellulose degrading enzyme systems can also comprise additional components such as metal ions, chelators, detergents, organic ions, inorganic ions, and additional proteins such as biotin and albumin.

In some embodiments of this invention, the cellulose degrading systems of this invention could be applied directly to the cellulose material. For example, a system containing one, some or all of the compounds listed in FIGS. 4-11 could be directly applied to a plant or other cellulose containing item such that the system would degrade the plant or other cellulose containing item. As another example, 2-40 could be grown on the plant or other cellulose containing item, which would allow the 2-40 to produce the compounds listed in FIGS. 4-11 in order to degrade the cellulose containing item as the 2-40 grows. An advantage of using the 2-40 or systems of this invention is that the degradation of the cellulose containing plant or item can be conducted in a marine environment, for example under water.

It is one aspect of the present invention to provide a nucleotide sequence that has a homology selected from 100%, 99%, 98%, 97%, 96%, 95%, 90%, 85%, 80%, or 75% to any of the sequences of the compounds listed in FIGS. 4-11.

The present invention also covers replacement of between 1 and 20 nucleotides of any of the sequences of the compounds listed in FIGS. 4-11 with non-natural or non-standard nucleotides, for example phosphorothioate, deoxyinosine, deoxyuridine, isocytosine, isoguanosine, ribonucleic acids including 2-O-methyl, and replacement of the phosphodiester backbone with, for example, alkyl chains, aryl groups, and peptide nucleic acid (PNA).

It is another aspect of some embodiments of this invention to provide a nucleotide sequence that hybridizes to any one of the sequences of the compounds listed in FIGS. 4-11 under stringency conditions of 1×SSC, 2×SSC, 3×SSC, 4×SSC, 5×SSC, 6×SSC, 7×SSC, 8×SSC, 9×SSC, or 10×SSC.

The scope of this invention covers natural and non-natural alleles of any one of the sequences of the compounds listed in FIGS. 4-11. In some embodiments of this invention, alleles of any one of the sequences of the compounds listed in FIGS. 4-11 can comprise replacement of one, two, three, four, or five naturally occurring amino acids with similarly charged, shaped, sized, or situated amino acids (conservative substitutions). The present invention also covers non-natural or non-standard amino acids, for example selenocysteine, pyrrolysine, 4-hydroxyproline, 5-hydroxylysine, phosphoserine, phosphotyrosine, and the D-isomers of the 20 standard amino acids.

Some embodiments of this invention are directed to a method for producing ethanol from lignocellulosic material, comprising treating lignocellulosic material with an effective saccharifying amount of one or more compounds listed in FIGS. 4-11, preferably cellulase cel5A listed in FIG. 4, to obtain saccharides and converting the saccharides to produce ethanol. The treating may be conducted in a marine environment, such as under water. The one or more compounds listed in FIGS. 4-11 may be present in dry form, in a buffer, or in the form of a paste, paint, or micelle.

Conversion of sugars to ethanol and recovery may be accomplished by, but is not limited to, any of the well-established methods known to those of skill in the art, for example, through the use of an ethanologenic microorganism, such as Zymomonas, Erwinia, Klebsiella, Xanthomonas, and Escherichia, preferably Escherichia coli KO11 and Klebsiella oxytoca P2.

In further aspects of the present invention, the lignocellulosic material is treated with an effective saccharifying amount of all of the compounds listed in FIGS. 4-11.

In further aspects of the present invention, the one or more compounds listed in FIGS. 4-11 are from Microbulbifer degradans 2-40.

In further aspects of the present invention, the one or more compounds listed in FIGS. 4-11 are in a system consisting essentially of one or more compounds listed in FIGS. 4-11 or a system further comprising metal ions, chelators, detergents, organic ions, inorganic ions, or one or more additional proteins, such as biotin and/or albumin.

Some embodiments of this invention are directed to ethanol produced by treating lignocellulosic material with an effective saccharifying amount of one or more compounds listed in FIGS. 4-11 to obtain saccharides and converting the saccharides to produce ethanol. Conversion of sugars to ethanol and recovery may be accomplished by, but is not limited to, any of the well-established methods known to those of skill in the art, for example, through the use of an ethanologenic microorganism, such as Zymomonas, Erwinia, Klebsiella, Xanthomonas, and Escherichia, preferably Escherichia coli KO11 and Klebsiella oxytoca P2.

Further embodiments of this invention are directed to a method for producing ethanol from lignocellulosic material, comprising contacting lignocellulosic material with a microorganism expressing an effective saccharifying amount of one or more compounds listed in FIGS. 4-11, preferably cellulase cel5A listed in FIG. 4, to obtain saccharides and converting the saccharides to produce ethanol. The contacting may be conducted in a marine environment, such as under water. The microorganism may be Microbulbifer degradans 2-40 or a recombinant microorganism containing a chimeric gene comprising at least one polynucleotide encoding a polypeptide comprising an amino acid sequence of at least one of the compounds listed in FIGS. 4-11, wherein the gene is operably linked to regulatory sequences that allow expression of the amino acid sequence by the microorganism. The recombinant microorganism may be a bacterium or yeast, such as Escherichia coli. In some aspects of the present invention, the recombinant microorganism is an ethanologenic microorganism, such as microorganisms from the species Zymomonas, Erwinia, Klebsiella, Xanthomonas, or Escherichia, preferably Escherichia coli KO11 or Klebsiella oxytoca P2.

Further aspects of the present invention are directed to ethanol produced by contacting lignocellulosic material with a microorganism expressing an effective saccharifying amount of one or more compounds listed in FIGS. 4-11 to obtain saccharides and converting the saccharides to produce ethanol.

A further aspect of the invention is directed to a method for producing ethanol from lignocellulosic material, comprising contacting lignocellulosic material with an ethanologenic microorganism expressing an effective saccharifying amount of one or more compounds listed in FIGS. 4-11 to produce ethanol. The ethanologenic microorganism expresses an effective amount of one or more compounds listed in FIGS. 4-11 to saccharify the lignocellulosic material and an effective amount of one or more enzymes or enzyme systems which, in turn, catalyze (individually or in concert) the conversion of the saccharides (e.g., sugars such as xylose and/or glucose) to ethanol. The one or more enzymes or enzyme systems of the ethanologenic organism may be expressed naturally or by, but not limited to, any of the methods known to those of skill in the art. For example, release of the one or more enzymes or enzyme systems may be obtained through the use of ultrasound. In some aspects of the present invention, the ethanologenic microorganism is transformed in order to be able to express one or more of the compounds listed in FIGS. 4-11. In some aspects of the present invention, the ethanologenic microorganism is from the genera Zymomonas, Erwinia, Klebsiella, Xanthomonas, or Escherichia, preferably Escherichia coli K011 or Klebsiella oxytoca P2.

It is to be understood that while the invention has been described above using specific embodiments, the description and examples are intended to illustrate the structural and functional principles of the present invention and are not intended to limit the scope of the invention. On the contrary, the present invention is intended to encompass all modifications, alterations, and substitutions within the spirit and scope of the appended claims.

REFERENCES CITED

-   Andrykovitch, G. and I. Marx (1988) “Isolation of a new polysaccharide-digesting bacterium from a salt marsh.” Applied and Environmental Microbiology 54: 3-4.
-   Beguin, P. and J. P. Aubert (1994) “The biological degradation of cellulose.” FEMS Microbiol Rev 13(1): 25-58.
-   Chakravorty, D. (1998). Cell Biology of Alginic Acid degradation by Marine Bacterium 2-40. College Park, University of Maryland.
-   Coutinho, P. M. and B. Henrissat (1999) Carbohydrate-active enzyme server. Accessed Jan. 21, 2004
-   Coutinho, P. M. and B. Henrissat (1999) The modular structure of cellulases and other carbohydrate-active enzymes: an integrated database approach. Genetics, biochemistry and ecology of cellulose degradation. T. Kimura. Tokyo, Uni Publishers Co: 15-23.
-   Distel, D. L., W. Morrill, et al. (2002) “Teredinibacter turnerae gen. nov., sp. nov., a dinitrogen-fixing, cellulolytic, endosymbiotic gamma-proteobacterium isolated from the gills of wood-boring molluscs (Bivalvia: Teredimidae).” Int J Syst Evol Microbiol 52(6): 2261-2269.
-   Ensor, L., S. K. Stotz, et al. (1999) “Expression of multiple insoluble complex polysaccharide degrading enzyme systems by a marine bacterium.” J Ind Microbiol Biotechnol 23: 123-126.
-   Gonzalez, J. and R. M. Weiner (2000) “Phylogenetic characterization of marine bacterium strain 2-40, a degrader of complex polysaccharides.” International journal of systematic evolution microbiology 50: 831-834.
-   Henrissat, B. and A. Bairoch (1993) “New families in the classification of glycosyl hydrolases based on amino acid sequence similarities.” Biochem J 293 (Pt 3): 781-8.
-   Henrissat, B., T. T. Teeri, et al. (1998) “A scheme for designating enzymes that hydrolyse the polysaccharides in the cell walls of plants.” FEBS Lett 425(2): 352-4.
-   Jonsson, A. P., Y. Aissouni, et al. (2001) “Recovery of gel-separated proteins for in-solution digestion and mass spectrometry.” Anal Chem 73(22): 5370-7.
-   Kelley, S. K., V. Coyne, et al. (1990) “Identification of a tyrosinase from a periphytic marine bacterium.” FEMS Microbiol Lett 67: 275-280.
-   Kosugi, A., K. Murashima, et al. (2002) “Characterization of two noncellulosomal subunits, ArfA and BgaA, from Clostridium cellulovorans that cooperate with the cellulosome in plant cell wall degradation.” J Bacteriol 184(24): 6859-65.
-   Laemmli, U. K. (1970). “Cleavage of structural proteins during the assembly of the head of the bacteriophage T4.” Nature 277: 680-685.
-   Ljungdahl, L. G. and K. E. Eriksson (1985) Ecology of Microbial Cellulose Degradation. Advances in Microbial Ecology. New York, Plenum Press. 8: 237-299.
-   Lou, J., K. Dawson, et al. (1996) “Role of phosphorolytic cleavage in cellobiose and cellodextrin metabolism by the ruminal bacterium Prevotella ruminicola.” Appl. Environ. Microbiol. 62(5): 1770-1773.
-   Lynd, L. R., P. J. Weimer, et al. (2002) “Microbial cellulose utilization: fundamentals and biotechnology.” Microbiol Mol Biol Rev 66(3): 506-77.
-   Shevchenko, A., M. Wilm, et al. (1996) “Mass spectrometric sequencing of proteins silver-stained polyacrylamide gels.” Anal Chem 68(5): 850-8.
-   Smith, R. D., J. A. Loo, et al. (1990) “New developments in biochemical mass spectrometry: electrospray ionization.” Anal Chem 62(9): 882-99.
-   Stotz, S. K. (1994). An agarase system from a periphytic prokaryote. College Park, University of Maryland.
-   Sumner, J. B. and E. B. Sisler (1944) “A simple method for blood sugar.” Archives of Biochemistry 4: 333-336.
-   Tomme, P., R. A. Warren, et al. (1995) “Cellulose hydrolysis by bacteria and fungi.” Adv Microb Physiol 37: 1-81.
-   Warren, R. A. (1996) “Microbial hydrolysis of polysaccharides.” Annu Rev Microbiol 50: 183-212.
-   Whitehead, L. (1997). Complex Polysaccharide Degrading Enzyme Arrays Synthesized By a Marine Bacterium. College Park, University of Maryland.

We claim:
 1. A method for producing ethanol from lignocellulosic material, comprising contacting lignocellulosic material with a microorganism expressing an effective saccharifying amount of cel5A, xyl/arb43G-xyn10D, xyn1-E, xyn10C or xyn11A to obtain saccharides and providing an ethanologenic organism to convert the saccharides to produce ethanol.
 2. The method according to claim 1, wherein the microorganism is Microbulbifer degradans 2-40.
 3. The method of claim 1, wherein the microorganism is a recombinant microorganism containing a chimeric gene comprising at least one polynucleotide encoding a polypeptide comprising an amino acid sequence of at least one of the compounds listed in FIGS. 4-11; wherein the gene is operably linked to regulatory sequences that allow expression of the amino acid sequence by the microorganism.
 4. The method of claim 3, wherein the recombinant microorganism is a bacteria or yeast.
 5. The method of claim 3, wherein the recombinant microorganism is Escherichia coli.
 6. The method of claim 4, wherein the ethanologenic organism is yeast.
 7. The method of claim 1, wherein the contacting is conducted in a marine environment.
 8. The method of claim 7, wherein the contacting is conducted under water.
 9. The method of claim 1, wherein the one or more compounds listed in FIGS. 4-11 is cellulase cel5A SEQ ID NO:1.
 10. The method of claim 1, wherein the microorganism expressing an effective saccharifying amount of cel5A, xyl/arb43G-xyn10D, xyn1-E, xyn10C or xyn11A and the ethanologenic organism are the same.